Now the first Rothenberg quote is often assumed to apply in the second case, but there’s clear evidence emerging that data are readable well beyond 5 years. Most files on my hard disk that have been brought forward from previous computers (with very different operating systems) remain readable even after 15 years. Most, but not all; in my case, the exceptions are my early PowerPoint files, created with PowerPoint 4 on a Mac in the 1990s, unreadable with today’s Mac PowerPoint version. However, these files are not completely inaccessible to me, as a colleague with a Windows machine has a different version of PowerPoint that can read these files and save them in a format that I can handle. It’s a comparatively simple, if tedious, migration.
I have spent some time and asked in various fora for evidence of genuinely obsolete (that is, completely inaccessible) data (Rusbridge, 2006). There are some apocryphal stories and anecdotes, but little hard evidence so far. But perhaps complete inaccessibility isn’t the point? Perhaps the issue is more about risk to content, on the one hand, or extent of information loss on the other? Maybe this isn’t a binary issue (inaccessible or not), but a graded obsolescence scale? Is there such a thing? I couldn’t find one (there’s a hint of one in an NTIS report abstract (Slebodnick, Anderson, Fagan, & Lisez, 1998), but the text isn’t online, and it appears to be about obsolescence of people)!
So here are a couple of attempts, for comment.
Here’s what might be called a Category approach. Users would be asked to select one only from the following categories (it worries me that they are neither complete nor non-overlapping!):
A Completely usable, not deemed at risk
- Open definition with no significant proprietary elements
- multiple implementations, at least one Open Source
- high quality treatment in secondary implementations
- widespread use
B Currently usable, at risk
- Closed definition, or
- Open definition, but with significant proprietary elements
- More than one implementation
- imperfect treatment in secondary implementations
C Currently usable, at significant risk
- Proprietary/closed definition
- single proprietary implementation
- Not widespread use
D Currently inaccessible, migration path known
- Proprietary tools don't run on current environment
- Potentially imperfect migration
E Completely inaccessible
- no known method of extraction or interpretation
Here’s a more subjective, but perhaps more complete scale. Users are asked to select a number from 1 to 5 representing where they assess their data lies, on a scale whose ends are defined as
1 Completely usable, not deemed at risk, and
5 Completely inaccessible
And here’s a more Likert-like scale. Users are asked to nominate their agreement with a statement such as “My data are completely inaccessible”, using the Likert scale:
1 Strongly disagree
2 Disagree
3 Neither agree nor disagree
4 Agree
5 Strongly Agree
Hmmm. How could you neither agree nor disagree with that statement? “Excuse me, I haven’t a clue”?
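As a rough illustration of the Category approach (A to E), here is a minimal Python sketch of how its criteria might be turned into a decision procedure. Everything named in it (`FormatProfile`, its fields, `categorize`) is hypothetical and invented for this example. Because the categories are neither complete nor non-overlapping, the sketch simply assumes a fixed order of checks and treats anything currently usable that misses A and has fewer than two implementations as C; other orderings would be equally defensible.

```python
from dataclasses import dataclass


@dataclass
class FormatProfile:
    """Hypothetical answers a user might give about one file format."""
    open_definition: bool
    significant_proprietary_elements: bool
    implementations: int
    has_open_source_implementation: bool
    high_quality_secondary_implementations: bool
    widespread_use: bool
    tools_run_on_current_environment: bool = True
    migration_path_known: bool = False


def categorize(p: FormatProfile) -> str:
    """Return a category letter A-E; the order of checks is an assumption."""
    # D and E: the data cannot be used directly in the current environment.
    if not p.tools_run_on_current_environment:
        return "D" if p.migration_path_known else "E"

    # A: every "not deemed at risk" criterion must hold.
    meets_a = (
        p.open_definition
        and not p.significant_proprietary_elements
        and p.implementations >= 2
        and p.has_open_source_implementation
        and p.high_quality_secondary_implementations
        and p.widespread_use
    )
    if meets_a:
        return "A"

    # B: usable and more than one implementation, but short of A.
    if p.implementations >= 2:
        return "B"

    # C: usable, but tied to a single implementation.
    return "C"


# The PowerPoint 4 files mentioned earlier: not readable on the current
# machine, but a migration path is known. The other field values are
# illustrative guesses and do not affect the result.
ppt4 = FormatProfile(
    open_definition=False,
    significant_proprietary_elements=True,
    implementations=1,
    has_open_source_implementation=False,
    high_quality_secondary_implementations=False,
    widespread_use=False,
    tools_run_on_current_environment=False,
    migration_path_known=True,
)
print(categorize(ppt4))  # D
```

Writing the checks out this way makes the overlap visible: a widely used open format with a single implementation falls through to C here, which may or may not be what the scale intends.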
Comments on this blog, please; but if you’re commenting elsewhere, can you use the tag “Obsolescence scale”, and maybe drop me a line? Thanks!
Rothenberg, J. (1995). Ensuring the longevity of digital documents. Scientific American, 272(1), 42. (Accessed through Ebscohost)
Rusbridge, C. (2006). Excuse Me... Some Digital Preservation Fallacies? [Electronic Version]. Ariadne, 46. Retrieved from http://www.ariadne.ac.uk/issue46/rusbridge/.
Slebodnick, E. B., Anderson, C. D., Fagan, P., & Lisez, L. (1998). Development of Alternative Continuing Educational Systems for Preventing the Technological Obsolescence of Air Force Scientists and Engineers. Volume I. Basic Study.
Hi Chris,
This morning I posted an article on "Obsolescence scale" on our weblog http://eisenduurzaamdigitaaldepot.blogspot.com, but since it is in Dutch, I'll give you a short translation.
In 1997 I bought STOA, a collection of poems that was only published on a 3.5" disk in WordPerfect 5.1 and Word 6.0. With the Rothenberg-dogma in mind, I expected to have a nice object to illustrate his point. But alas!
The disk was perfectly OK (in my organisation all desktops are still equipped to read those disks) and both files can effortlessly be opened in Word 2002. So again, no good example of total loss.
But I'll keep an eye open.
It's not clear from your post whether you're referring to formats 'on hand' for existing data or to making format choices for creating data from here on. In either case, it may be very difficult for content creators to even know the answers to some of the proposed questions.
In discussing format choices, I often advocate the use of standards based open formats by suggesting some key characteristics. Formats should:
* be based on freely available standards.
* be developed by a community of interest rather than by a single entity.
* have multiple implementations in software.
* have no intellectual property restrictions (eg patents) attached.
Also, I suggest that obsolescence is something that can never be judged until some years (decades?) after it has taken place.
Thought provoking stuff anyway.
--
Michael Carden